Abstract

We consider two-player infinite games played on graphs.
The games are concurrent, in that at each state the
players choose their moves simultaneously and independently,
and stochastic, in that the moves determine a
probability distribution for the successor state. The
value of a game is the maximal probability with which
a player can guarantee the satisfaction of her objective.
We show that the values of concurrent games with ω-regular
objectives expressed as parity conditions can be
decided in NP ∩ coNP. This result substantially improves
the best known previous bound of 3EXPTIME.
It also shows that the full class of concurrent parity
games is no harder than the special case of turn-based
stochastic reachability games, for which NP ∩ coNP is
the best known bound.

While the previous, more restricted NP ∩ coNP results
for graph games relied on the existence of particularly
simple (pure memoryless) optimal strategies, in
concurrent games with parity objectives optimal strategies
may not exist, and ε-optimal strategies (which
achieve the value of the game within a parameter ε > 0)
require in general both randomization and infinite memory.
Hence our proof must rely on a more detailed
analysis of strategies and, in addition to the main result,
yields two results that are interesting on their own.
First, we show that there exist ε-optimal strategies that
in the limit coincide with memoryless strategies; this
parallels the celebrated result of Mertens-Neyman for
concurrent games with limit-average objectives. Second,
we complete the characterization of the memory requirements
for ε-optimal strategies for concurrent games
with parity conditions, by showing that memoryless
strategies suffice for ε-optimality for coBüchi conditions.
1 Introduction

We  consider  infinite  recursive  games  played  between 
two  players  over  a  graph  [23,  10,  17].  The  games  proceed 
in  an  infinite  number  of  rounds.  In  each  round,  the 
players  choose  moves;  the  two  moves,  together  with 
the  current  state,  determine  a  probability  distribution 
for  the  successor  state.  An  outcome  of  the  game, 
or  a  play,  consists  of  the  infinite  sequence  of  states 
visited.  These  graph  games  can  be  broadly  classified 
into  turn-based  and  concurrent  games.  In  turn-based 
games,  in  any  given  round  only  one  player  can  choose 
among  multiple  moves:  effectively,  the  set  of  states  of 
the  graph  can  be  partitioned  into  the  states  where  it 
is player 1's turn to play, and the states where it is
player  2’s  turn  to  play.  In  concurrent  games,  both 
players  may  have  multiple  moves  available  at  each  state, 
and  the  players  choose  their  moves  simultaneously  and 
independently.  Concurrent  games  provide  a  natural 
framework  to  model  reactive  systems  with  synchronous 
interactions  [1]. 

An important class of winning conditions are the
ω-regular languages. In such games, the goal of player 1
is to ensure that the play belongs to a specified ω-regular
language; the goal of player 2 is to ensure
that the play does not belong to the language. The
games are thus zero-sum: the objectives of the two
players are complementary. The ω-regular languages
are the generalization to infinite words of the classical
regular languages [25]; the properties expressible by
ω-regular languages include safety, reachability, and
fairness. Games with ω-regular winning conditions
have been applied to system synthesis [3, 22, 20] and
verification [9, 1]. Of particular interest are ω-regular
languages that are given as parity conditions on game
graphs; this is because every ω-regular game can be
converted into a parity game [19, 26]. Hence concurrent
games with parity conditions provide an adequate model
for the synthesis of synchronous reactive systems.

Given a recursive game and an ω-regular language
$\mathcal{L}$, the value $\langle\!\langle 1 \rangle\!\rangle_{val}(\mathcal{L})(s)$ of the game for player 1
at a state s is equal to the maximal probability with
which player 1 can ensure that the play lies in $\mathcal{L}$; the
value $\langle\!\langle 2 \rangle\!\rangle_{val}(\mathcal{L})(s)$ of the game for player 2 at s is equal
to the maximal probability with which player 2 can ensure
that the play lies outside $\mathcal{L}$. Martin's determinacy
theorem ensures that $\langle\!\langle 1 \rangle\!\rangle_{val}(\mathcal{L})(s) + \langle\!\langle 2 \rangle\!\rangle_{val}(\mathcal{L})(s) = 1$
[15]. Except for the special case of turn-based games,
little has been known about the computational complexity
of finding the value for a recursive game with an
ω-regular winning condition. In the turn-based case,
it is known that the value of games with parity conditions
can be computed in NP ∩ coNP. This result was
obtained for turn-based deterministic parity games, in
which each move determines uniquely (instead of probabilistically)
the successor state, in [9], and for turn-based
stochastic reachability games in [6]; the case of
turn-based stochastic parity games was shown in [4].

Concurrent  games  are  substantially  more  complex 
than  turn-based  games  in  several  respects.  To  see  this, 
consider  the  structure  of  optimal  strategies,  which  are 
strategies  that  achieve  the  value  of  a  given  game.  For 
turn-based stochastic ω-regular games, there always
exist  pure  (deterministic)  optimal  strategies,  which  do 
not  rely  on  randomized  choice  [4,  16];  in  the  case  of 
turn-based  stochastic  parity  games,  moreover,  there  are 
always  pure  memoryless  optimal  strategies,  where  the 
choice  of  move  depends  only  on  the  current  state,  rather 
than also on the past history of the game. It is this
observation that led to the NP ∩ coNP results for
turn-based parity games.

By contrast, in concurrent games, already for reachability
conditions, players must in general play with randomized
(non-pure) strategies, which prescribe, in each
round, a probability distribution over the moves to be
played. Furthermore, optimal strategies may not exist:
rather, for every real ε > 0, the players have ε-optimal
strategies, which achieve the value of the game
within ε. Even for relatively simple parity winning conditions,
such as Büchi conditions, ε-optimal strategies
need both randomization and infinite memory [8]. It
is therefore not inconceivable that the complexity of
concurrent parity games might be considerably worse
than NP ∩ coNP. The only known previous algorithm
for computing the value of concurrent parity games is
triple-exponential [8]: it was obtained via a reduction to
the theory of the real closed fields, and then using decision
procedures for the theory of reals with addition and
multiplication [24, 2]. Even for the simpler Büchi winning
conditions the previously known complexity was
EXPTIME [8].

In this paper, we show that the problem of computing
the value of a concurrent parity game is in NP
∩ coNP. More precisely, as the value of a concurrent
game at a state can be an irrational number, we show
that given an encoding of the game, and a rational r,
for all rationals ε > 0, whether the value of the game
is in the interval [r − ε, r + ε] can be decided in NP
∩ coNP. This result generalizes the best known upper
bound (NP ∩ coNP) for very restricted cases, such as
turn-based deterministic parity games and turn-based
stochastic reachability games, to the class of all concurrent
parity games.^

The basic idea behind the proof, which can no
longer rely on the existence of pure memoryless optimal
strategies, is as follows. We call a value class the set of
states where the game has the same value for player 1.
By the results of [7] on qualitative winning (i.e., winning
with probability 1), if the (player 1) value of the game
is not constant 1 or 0, then there are two non-empty
value classes $W_1$ and $W_2$ where the value is 1 and 0,
respectively. We show that if the players play ε-optimal
strategies, then $W_1 \cup W_2$ is reached with probability 1.
Through a detailed analysis of the branching structure
of the stochastic process of the game, we go on to show
that we can construct an ε-optimal strategy by stitching
together strategies, one per value class. This gives
us a polynomial witness for the resulting strategy and
proves membership in NP. Membership in NP ∩ coNP
follows from the fact that the problem is symmetric in
players 1 and 2.

A detailed analysis of our proof gives us several
new results about the structure of ε-optimal strategies
in concurrent parity games. First, we show that concurrent
games with coBüchi winning conditions admit
memoryless ε-optimal strategies. This result completes
the characterization of the memory requirements of the
ε-optimal strategies for concurrent ω-regular games:
it was previously known that safety and reachability
games admit memoryless ε-optimal strategies [11, 8],
and that Büchi conditions may require infinite memory
[8]. Second, we show that in concurrent parity
games, the limit of the ε-optimal strategies for ε → 0 is a
memoryless strategy (which in general is not optimal).
This result parallels the celebrated result of Mertens-Neyman
[18] for concurrent games with limit-average
objectives.

2  Definitions 

Notation. For a countable set A, a probability distribution
on A is a function $\delta : A \to [0, 1]$ such that
$\sum_{a \in A} \delta(a) = 1$. We denote the set of probability distributions
on A by $\mathcal{D}(A)$. Given a distribution $\delta \in \mathcal{D}(A)$,
we denote by $\mathrm{Supp}(\delta) = \{ x \in A \mid \delta(x) > 0 \}$ the support
of $\delta$.

^ For turn-based deterministic parity games a bound of UP ∩
coUP is also known [12], but for turn-based stochastic reachability
and turn-based stochastic parity games NP ∩ coNP is the best
known bound.

Definition 2.1. (Concurrent Game Structures)
A (two-player) concurrent game structure
$G = \langle S, M, \Gamma_1, \Gamma_2, \delta \rangle$ consists of the following components:

• A finite state space S and a finite set M of moves.

• Two move assignments $\Gamma_1, \Gamma_2 : S \to 2^M \setminus \emptyset$. For
$i \in \{1, 2\}$, the move assignment $\Gamma_i$ associates with
each state $s \in S$ the non-empty set $\Gamma_i(s) \subseteq M$ of
moves available to player i at state s.

• A probabilistic transition function $\delta : S \times M \times M \to \mathcal{D}(S)$,
which gives the probability $\delta(s, a_1, a_2)(t)$ of a
transition from s to t when player 1 plays move $a_1$
and player 2 plays move $a_2$, for all $s, t \in S$ and
$a_1 \in \Gamma_1(s)$, $a_2 \in \Gamma_2(s)$. ■

We define the size of the game structure G to be equal
to the size of the transition function $\delta$; specifically,

$|G| = \sum_{s,t \in S} \sum_{a \in \Gamma_1(s)} \sum_{b \in \Gamma_2(s)} |\delta(s, a, b)(t)|,$

where $|\delta(s, a, b)(t)|$ denotes the space to specify the
probability distribution. We write n to denote the
size of the state space, i.e., $n = |S|$. At every state
$s \in S$, player 1 chooses a move $a_1 \in \Gamma_1(s)$, and simultaneously
and independently player 2 chooses a move
$a_2 \in \Gamma_2(s)$. The game then proceeds to the successor
state t with probability $\delta(s, a_1, a_2)(t)$, for all $t \in S$. A
state s is called an absorbing state if for all $a_1 \in \Gamma_1(s)$
and $a_2 \in \Gamma_2(s)$ we have $\delta(s, a_1, a_2)(s) = 1$. In other
words, at s for all choices of moves of the players the
next state is always s. A state s is a turn-based state if
there exists $i \in \{1, 2\}$ such that $|\Gamma_i(s)| = 1$. Moreover,
if $|\Gamma_2(s)| = 1$ then the state s is a player-1 turn-based
state since the choice of moves for player 2 is trivial; and
if $|\Gamma_1(s)| = 1$ then it is a player-2 turn-based state. For
all states $s \in S$ and moves $a_1 \in \Gamma_1(s)$ and $a_2 \in \Gamma_2(s)$,
we indicate by $\mathrm{Dest}(s, a_1, a_2) = \mathrm{Supp}(\delta(s, a_1, a_2))$ the
set of possible successors of s when moves $a_1$, $a_2$ are
selected.
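As an illustration of Definition 2.1, the following is a minimal Python sketch of a concurrent game structure together with the notions of Dest, absorbing state, and turn-based state. The class name `ConcurrentGame`, its field names, and the two-state example game are hypothetical choices made for this sketch; they do not come from the paper.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class ConcurrentGame:
    """Illustrative encoding of a concurrent game structure (names are assumptions)."""
    moves1: dict  # Gamma_1: state -> set of moves available to player 1
    moves2: dict  # Gamma_2: state -> set of moves available to player 2
    delta: dict   # (s, a1, a2) -> {successor state: probability}

    def dest(self, s, a1, a2):
        # Dest(s, a1, a2) is the support of delta(s, a1, a2).
        return {t for t, p in self.delta[(s, a1, a2)].items() if p > 0}

    def is_absorbing(self, s):
        # Every pair of available moves leads back to s with probability 1.
        return all(self.delta[(s, a1, a2)].get(s, 0) == 1
                   for a1 in self.moves1[s] for a2 in self.moves2[s])

    def is_turn_based(self, s):
        # At least one player has exactly one available move at s.
        return len(self.moves1[s]) == 1 or len(self.moves2[s]) == 1


# A hypothetical two-state game used only for illustration.
half = Fraction(1, 2)
game = ConcurrentGame(
    moves1={"s0": {"a", "b"}, "goal": {"a"}},
    moves2={"s0": {"c", "d"}, "goal": {"c"}},
    delta={
        ("s0", "a", "c"): {"goal": Fraction(1)},
        ("s0", "a", "d"): {"s0": half, "goal": half},
        ("s0", "b", "c"): {"s0": Fraction(1)},
        ("s0", "b", "d"): {"goal": Fraction(1)},
        ("goal", "a", "c"): {"goal": Fraction(1)},
    },
)
```

In this example `goal` is absorbing and turn-based, while `s0` is a genuinely concurrent state.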

Plays. A path or a play $\omega$ of G is an infinite sequence
$\omega = \langle s_0, s_1, s_2, \ldots \rangle$ of states in S such that for all $k \ge 0$,
there are moves $a_1^k \in \Gamma_1(s_k)$ and $a_2^k \in \Gamma_2(s_k)$ with
$\delta(s_k, a_1^k, a_2^k)(s_{k+1}) > 0$. We denote by $\Omega$ the set of all
paths and by $\Omega_s$ the set of all paths $\omega = \langle s_0, s_1, s_2, \ldots \rangle$
such that $s_0 = s$, i.e., the set of plays that start from
the state s.

Randomized strategies. A selector $\xi$ for player
$i \in \{1, 2\}$ is a function $\xi : S \to \mathcal{D}(M)$ such that for all
$s \in S$ and $a \in M$, if $\xi(s)(a) > 0$ then $a \in \Gamma_i(s)$. We
denote by $\Lambda_i$ the set of all selectors for player $i \in \{1, 2\}$.
A selector $\xi$ is pure if for every $s \in S$ there exists $a \in M$
such that $\xi(s)(a) = 1$; we denote by $\Lambda_i^P \subseteq \Lambda_i$ the set
of pure selectors for player i. A strategy for player 1
is a function $\sigma : S^+ \to \Lambda_1$ that associates with every
finite non-empty sequence of states, representing the
history of the play so far, a selector. Similarly we define
strategies $\pi$ for player 2. A strategy $\sigma$ for player i is pure
if it yields only pure selectors, that is, if it is of type
$S^+ \to \Lambda_i^P$. A memoryless strategy is independent of
the history of the play and depends only on the current
state. Memoryless strategies coincide with selectors,
and we often write $\sigma$ for the selector corresponding to a
memoryless strategy $\sigma$. A strategy is pure memoryless
if it is pure and memoryless. We denote by $\Sigma$ and
$\Pi$ the set of all strategies for player 1 and player 2,
respectively.
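To make the role of selectors concrete, the sketch below computes the one-step successor distribution at a state when both players follow memoryless strategies, that is, selectors. The function names `induced_step` and `is_pure`, the dictionary encoding, and the example transition function are assumptions made for illustration only.

```python
from fractions import Fraction


def induced_step(delta, xi1, xi2, s):
    """Successor distribution at s when player 1 uses selector xi1 and player 2 uses xi2."""
    dist = {}
    for a1, p1 in xi1[s].items():
        for a2, p2 in xi2[s].items():
            for t, p in delta[(s, a1, a2)].items():
                dist[t] = dist.get(t, Fraction(0)) + p1 * p2 * p
    return dist


def is_pure(xi):
    """A selector is pure if at every state some move is played with probability 1."""
    return all(any(p == 1 for p in d.values()) for d in xi.values())


# Hypothetical transition function and selectors, for illustration only.
half = Fraction(1, 2)
delta = {
    ("s0", "a", "c"): {"goal": Fraction(1)},
    ("s0", "a", "d"): {"s0": half, "goal": half},
    ("s0", "b", "c"): {"s0": Fraction(1)},
    ("s0", "b", "d"): {"goal": Fraction(1)},
    ("goal", "a", "c"): {"goal": Fraction(1)},
}
xi1 = {"s0": {"a": half, "b": half}, "goal": {"a": Fraction(1)}}
xi2 = {"s0": {"c": Fraction(1, 3), "d": Fraction(2, 3)}, "goal": {"c": Fraction(1)}}
step = induced_step(delta, xi1, xi2, "s0")
```

Once both selectors are fixed, repeating this computation at every state yields an ordinary Markov chain on S, which is the sense in which the game reduces to a stochastic process.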

Once the starting state s and the strategies $\sigma$ and
$\pi$ for the two players have been chosen, the game is
reduced to an ordinary stochastic process. Hence, the
probabilities of events are uniquely defined, where an
event $\mathcal{A} \subseteq \Omega_s$ is a measurable set of paths. For an
event $\mathcal{A} \subseteq \Omega_s$ we denote by $\Pr_s^{\sigma,\pi}(\mathcal{A})$ the probability
that a path belongs to $\mathcal{A}$ when the game starts from s
and the players follow the strategies $\sigma$ and $\pi$. For $i \ge 0$,
we also denote by $\Theta_i : \Omega_s \to S$ the random variable
denoting the i-th state along a path.

Objectives. We specify objectives for the players by
providing the set of winning plays $\Phi \subseteq \Omega$ for each
player. In this paper we study only zero-sum games
[21, 11], where the objectives of the two players are
strictly competitive. In other words, it is implicit that
if the objective of one player is $\Phi$, then the objective of
the other player is $\Omega \setminus \Phi$. Given a game graph G and an
objective $\Phi \subseteq \Omega$, we write $(G, \Phi)$ for the game played
on the graph G with the objective $\Phi$ for player 1.

A general class of objectives are the Borel objectives
[13]. A Borel objective $\Phi \subseteq S^\omega$ is a Borel set
in the Cantor topology on $S^\omega$. In this paper we consider
ω-regular objectives [26], which lie in the first
2½ levels of the Borel hierarchy (i.e., in the intersection
of $\Sigma_3$ and $\Pi_3$). The ω-regular objectives, and
subclasses thereof, can be specified in the following
forms. For a play $\omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega$, we define
$\mathrm{Inf}(\omega) = \{ s \in S \mid s_k = s \text{ for infinitely many } k \ge 0 \}$ to
be the set of states that occur infinitely often in $\omega$.

• Reachability and safety objectives. Given a set
$T \subseteq S$ of "target" states, the reachability objective
requires that some state of T be visited. The
set of winning plays is thus $\mathrm{Reach}(T) = \{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in T \text{ for some } k \ge 0 \}$.
Given a set $F \subseteq S$, the safety objective requires
that only states of F be visited. Thus, the set of
winning plays is $\mathrm{Safe}(F) = \{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in F \text{ for all } k \ge 0 \}$.

• Büchi and coBüchi objectives. Given a set $B \subseteq S$
of "Büchi" states, the Büchi objective requires
that B is visited infinitely often. Formally, the
set of winning plays is $\mathrm{Büchi}(B) = \{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \cap B \ne \emptyset \}$.
Given $C \subseteq S$, the coBüchi
objective requires that all states visited infinitely
often are in C. Formally, the set of winning plays
is $\mathrm{coBüchi}(C) = \{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \subseteq C \}$.

• Parity objectives. For $c, d \in \mathbb{N}$, we let $[c..d] = \{ c, c + 1, \ldots, d \}$.
Let $p : S \to [0..d]$ be a function
that assigns a priority $p(s)$ to every state $s \in S$,
where $d \in \mathbb{N}$. The Even parity objective is defined
as $\mathrm{Parity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is even} \}$,
and the Odd parity objective as $\mathrm{coParity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is odd} \}$.
Informally we say that a path $\omega$ satisfies the parity objective
Parity(p) if $\omega \in \mathrm{Parity}(p)$. Note that for a
priority function $p : S \to \{0, 1\}$, an even
parity objective Parity(p) is equivalent to the Büchi
objective $\mathrm{Büchi}(p^{-1}(0))$, i.e., the Büchi set consists
of the states with priority 0. Hence Büchi and
coBüchi objectives are simpler and special cases of
parity objectives.
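The objectives above can be checked mechanically on ultimately periodic plays, that is, plays of the form prefix followed by a cycle repeated forever. The sketch below is an illustration under that assumption (general plays need not have this shape); the function names and the lasso representation are hypothetical and not part of the paper.

```python
def visited(prefix, cycle):
    """All states occurring in the play prefix . cycle^omega."""
    return set(prefix) | set(cycle)


def inf_states(prefix, cycle):
    """Inf(omega): for a lasso-shaped play these are exactly the states on the cycle."""
    return set(cycle)


def reach(prefix, cycle, T):
    return bool(visited(prefix, cycle) & set(T))


def safe(prefix, cycle, F):
    return visited(prefix, cycle) <= set(F)


def buchi(prefix, cycle, B):
    return bool(inf_states(prefix, cycle) & set(B))


def cobuchi(prefix, cycle, C):
    return inf_states(prefix, cycle) <= set(C)


def parity_even(prefix, cycle, p):
    """Even parity objective: the minimal priority seen infinitely often is even."""
    return min(p[s] for s in inf_states(prefix, cycle)) % 2 == 0


# Hypothetical play s0 s1 (s2 s3)^omega, with a non-empty cycle.
prefix, cycle = ["s0", "s1"], ["s2", "s3"]
p = {"s0": 0, "s1": 1, "s2": 1, "s3": 2}
p01 = {"s0": 0, "s1": 1, "s2": 1, "s3": 0}
buchi_set = {s for s, prio in p01.items() if prio == 0}
```

With the two-priority function `p01`, `parity_even` agrees with `buchi` on the set of priority-0 states, matching the remark that Büchi objectives are a special case of parity objectives.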

Given any parity objective, we write $\Omega_e$ to denote
Parity(p); this set is measurable for any choice of
strategies for the two players [27]. Similarly we write
$\Omega_o$ to denote coParity(p). Note that $\Omega_e \cap \Omega_o = \emptyset$ and
$\Omega_e \cup \Omega_o = \Omega$. Given a state s we write $\Omega_{es}$ to denote $\Omega_s \cap \Omega_e$
and similarly we write $\Omega_{os}$ to denote $\Omega_s \cap \Omega_o$. Hence,
the probability that a path satisfies objective Parity(p)
starting from state $s \in S$, given the strategies $\sigma$, $\pi$ for
the players, is $\Pr_s^{\sigma,\pi}(\Omega_{es})$. Given a state $s \in S$ and a
parity objective Parity(p), we are interested in finding
the maximal probability with which player 1 can ensure
that Parity(p) holds from s, and with which player 2 can
ensure that coParity(p) holds from s. We call such probability
the value of the game G at s for player $i \in \{1, 2\}$. The values
for player 1 and player 2 are given by the functions
$\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e) : S \to [0, 1]$ and $\langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o) : S \to [0, 1]$, defined for
all $s \in S$ by

$\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Omega_{es})$

and

$\langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o)(s) = \sup_{\pi \in \Pi} \inf_{\sigma \in \Sigma} \Pr_s^{\sigma,\pi}(\Omega_{os}).$

Note that the objectives of the players are complementary
and hence we have a zero-sum game. Concurrent
games satisfy a quantitative version of determinacy [15],
stating that for all parity objectives, and all $s \in S$, we
have $\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) + \langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o)(s) = 1$. A strategy
$\sigma$ for player 1 is optimal if for all $s \in S$ we have
$\inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Omega_{es}) = \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s)$. For $\varepsilon > 0$, a
strategy $\sigma$ for player 1 is ε-optimal if for all $s \in S$
we have $\inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Omega_{es}) \ge \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) - \varepsilon$. We
define optimal and ε-optimal strategies for player 2
symmetrically. Note that the quantitative determinacy
of concurrent games is equivalent to the existence of ε-optimal
strategies for both players, for all $\varepsilon > 0$, at all
states $s \in S$.
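The gap between optimal and ε-optimal strategies can be seen on a small concurrent reachability game that is commonly used in the literature as an illustration (often called "hide-or-run"); it is not taken from this paper. The sketch below assumes the following rules: at the state `hide`, player 1 picks `run` or `hide` and player 2 picks `throw` or `wait`; `run` against `wait` and `hide` against `throw` reach the target `home`, `run` against `throw` reaches the losing absorbing state `wet`, and `hide` against `wait` stays in `hide`. The function name and the restriction to memoryless strategies for both players are assumptions of the sketch.

```python
from fractions import Fraction


def reach_home_probability(eps, throw):
    """Probability of eventually reaching 'home' from 'hide' when player 1 runs
    with probability eps in every round and player 2 throws with probability
    `throw` in every round (both strategies memoryless)."""
    run, hide = eps, 1 - eps
    wait = 1 - throw
    p_home = run * wait + hide * throw  # run vs. wait, or hide vs. throw
    p_wet = run * throw                 # run vs. throw
    if p_home + p_wet == 0:
        return Fraction(0)              # hide vs. wait forever: target never reached
    # The play leaves 'hide' with probability 1; condition on the exit round.
    return p_home / (p_home + p_wet)


def worst_case(eps, grid=10):
    """Minimum over a grid of memoryless player-2 strategies (an approximation
    of the infimum, restricted to this grid)."""
    return min(reach_home_probability(eps, Fraction(q, grid)) for q in range(grid + 1))
```

Under these assumptions, running with a small probability ε > 0 guarantees `home` with probability 1 − ε against every player-2 strategy on the grid, while never running (ε = 0) guarantees nothing against a player 2 who always waits. This is the kind of behavior that the definition of ε-optimality is designed to capture.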


The branching structure of plays. Many of the arguments
developed in this paper rely on a detailed analysis
of the branching process resulting from the strategies
chosen by the players, and from the probabilistic
transition relation of the game. In order to make our
arguments precise, we need some definitions. A play is
feasible if each of its transitions could have arisen according
to the transition relation of the game.

Definition 2.2. (Feasible plays and outcomes)
Given two strategies $\sigma$ for player 1 and $\pi$ for player 2,
a play $\omega = \langle s_0, s_1, s_2, \ldots \rangle$ is feasible in a concurrent
game structure G if for every $k \in \mathbb{N}$ the following
conditions hold for some $a_1 \in \Gamma_1(s_k)$ and $a_2 \in \Gamma_2(s_k)$:
(1) $s_{k+1} \in \mathrm{Dest}(s_k, a_1, a_2)$; (2) $\sigma(s_0, s_1, \ldots, s_k)(a_1) > 0$;
and (3) $\pi(s_0, s_1, \ldots, s_k)(a_2) > 0$. Given strategies
$\sigma \in \Sigma$ and $\pi \in \Pi$, and a state s, we denote by
$\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Omega_s$ the set of feasible plays that
start from s, given the strategies $\sigma$ and $\pi$. ■
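The three feasibility conditions can be checked directly on any finite prefix of a play. The following sketch does so under simplifying assumptions: a strategy is encoded as a Python function from a history (a tuple of states) to a distribution over the moves at the last state of that history, and the names `is_feasible_prefix`, `sigma_b`, `sigma_mix`, and `pi_c`, as well as the example game, are hypothetical.

```python
from fractions import Fraction


def is_feasible_prefix(delta, moves1, moves2, sigma, pi, prefix):
    """Check conditions (1)-(3) of feasibility for every transition of a finite prefix."""
    for k in range(len(prefix) - 1):
        history = tuple(prefix[:k + 1])
        s, t = prefix[k], prefix[k + 1]
        d1, d2 = sigma(history), pi(history)
        witnessed = any(
            d1.get(a1, 0) > 0 and d2.get(a2, 0) > 0      # conditions (2) and (3)
            and delta[(s, a1, a2)].get(t, 0) > 0         # condition (1)
            for a1 in moves1[s] for a2 in moves2[s]
        )
        if not witnessed:
            return False
    return True


# Hypothetical game and strategies, for illustration only.
half = Fraction(1, 2)
moves1 = {"s0": {"a", "b"}, "goal": {"a"}}
moves2 = {"s0": {"c", "d"}, "goal": {"c"}}
delta = {
    ("s0", "a", "c"): {"goal": Fraction(1)},
    ("s0", "a", "d"): {"s0": half, "goal": half},
    ("s0", "b", "c"): {"s0": Fraction(1)},
    ("s0", "b", "d"): {"goal": Fraction(1)},
    ("goal", "a", "c"): {"goal": Fraction(1)},
}


def sigma_b(history):   # pure memoryless: always play b at s0
    return {"b": Fraction(1)} if history[-1] == "s0" else {"a": Fraction(1)}


def sigma_mix(history):  # memoryless: mix a and b uniformly at s0
    return {"a": half, "b": half} if history[-1] == "s0" else {"a": Fraction(1)}


def pi_c(history):      # pure memoryless: always play c
    return {"c": Fraction(1)}
```

For instance, under `sigma_b` and `pi_c` the play can never leave `s0`, so the prefix `s0, goal` is not feasible, whereas it becomes feasible once player 1 mixes with `sigma_mix`.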

In  order  to  make  precise  statements  about  the  branching 
process  arising  from  a  play,  we  define  trees  labeled  by 
game  states. 

Definition 2.3. (Infinite trees, S-labeled
trees, and trees for events) An infinite tree is a
set $Tr \subseteq \mathbb{N}^*$ such that (a) if $x \cdot i \in Tr$, where $x \in \mathbb{N}^*$
and $i \in \mathbb{N}$, then $x \in Tr$; (b) for all $x \in Tr$ there exists
$i \in \mathbb{N}$ such that $x \cdot i \in Tr$. We refer to $x \cdot i$ as a
successor of x. We call the elements of Tr nodes,
and the empty word $\epsilon$ is the root of the tree. An infinite
path $\tau$ of Tr is a set $\tau \subseteq Tr$ such that (a) $\epsilon \in \tau$; (b)
for every x in $\tau$ there is a unique $i \in \mathbb{N}$ such that
$x \cdot i \in \tau$. Note that for every $i \in \mathbb{N}$, there is a unique
element $x \in \tau$ such that $|x| = i$. We denote by $\tau_i$ the
element $x \in \tau$ such that $|x| = i$. Given an infinite
tree Tr and a node $x \in Tr$, we denote by $Tr(x)$ the
sub-tree rooted at node x. Formally, $Tr(x)$ denotes the
set $\{ x' \in Tr \mid x \text{ is a prefix of } x' \}$.

An S-labeled tree T is a pair $(Tr, \langle\cdot\rangle)$, where Tr is a
tree and $\langle\cdot\rangle : Tr \to S$ maps each node of Tr to a state
$s \in S$. Given an S-labeled tree T, and an infinite path
$\tau \subseteq Tr$, we denote by $\langle\tau\rangle$ the play $\langle s_0, s_1, s_2, \ldots \rangle$ such
that $s_0 = \langle\epsilon\rangle$ and for all $i \ge 0$ we have $s_i = \langle\tau_i\rangle$. An
S-labeled tree $T_s = (Tr_s, \langle\cdot\rangle)$, where $\langle\epsilon\rangle = s$, represents
a set of infinite paths, denoted as $\mathcal{L}(T_s) \subseteq \Omega_s$, such
that $\mathcal{L}(T_s) = \{ \omega = \langle s_0 = s, s_1, s_2, \ldots \rangle \in \Omega_s \mid \exists \tau \subseteq Tr_s.\ \langle\tau\rangle = \omega \}$.
An S-labeled tree $T_s$ represents an event
$\mathcal{A} \subseteq \Omega_s$ if and only if $\mathcal{L}(T_s) = \mathcal{A}$. ■

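Trees for outcomes and events. Let $T = (Tr, \langle\cdot\rangle)$
be an S-labeled tree and consider $x \in Tr$ such that
$|x| = n$. We denote by $x_i$ the prefix of x of length
i. We denote by $\mathrm{hist}(x) = \langle \langle\epsilon\rangle, \langle x_1\rangle, \ldots, \langle x_n\rangle \rangle$ the
history represented by the path from the root to
the node x. Given strategies $\sigma$ and $\pi$, and a state
s, an S-labeled tree $T_s^{\sigma,\pi} = (Tr_s^{\sigma,\pi}, \langle\cdot\rangle)$ to represent
$\mathrm{Outcome}(s, \sigma, \pi)$ is defined as follows: (a) $\langle\epsilon\rangle = s$;
(b) for $x \in Tr_s^{\sigma,\pi}$, let $|x| = n$, and consider the set

$U = \bigcup \{ \mathrm{Dest}(\langle x_n\rangle, a_1, a_2) \mid \sigma(\mathrm{hist}(x))(a_1) > 0,\ \pi(\mathrm{hist}(x))(a_2) > 0 \}.$

The set of successors for x in the tree is $x \cdot j$ for
$j \in \{ 1, 2, \ldots, |U| \}$, and the labeling function $\langle\cdot\rangle$ is a
bijection from the successors of x to the set U of states.
For an event $\mathcal{A}$, the stochastic tree $T_{s,\mathcal{A}}^{\sigma,\pi} = (Tr_{s,\mathcal{A}}^{\sigma,\pi}, \langle\cdot\rangle)$
is constructed from $T_s^{\sigma,\pi}$ by retaining the set of paths
$\mathcal{A} \cap \mathrm{Outcome}(s, \sigma, \pi)$.^ We denote by $\mathrm{Cone}(x) = \{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \mid \langle x_i\rangle = s_i \text{ for all } 0 \le i \le n \}$ the set
of paths with the prefix $\mathrm{hist}(x)$. Given a measurable
event $\mathcal{A} \subseteq \Omega_s$ along with strategies $\sigma$ and $\pi$ such
that $\Pr_s^{\sigma,\pi}(\mathcal{A}) > 0$, consider the S-labeled tree $T_{s,\mathcal{A}}^{\sigma,\pi}$
to represent $\mathcal{A} \cap \mathrm{Outcome}(s, \sigma, \pi)$. Consider the event
$\mathcal{A}_{nil} = \bigcup \{ \mathrm{Cone}(x) \mid x \in Tr_{s,\mathcal{A}}^{\sigma,\pi},\ \Pr_s^{\sigma,\pi}(\mathrm{Cone}(x) \cap \mathcal{A}) = 0 \}$.
Since $\mathcal{A}_{nil} \cap \mathcal{A}$ is a countable union of measurable sets
$\mathrm{Cone}(x) \cap \mathcal{A}$, each with measure 0, we have $\Pr_s^{\sigma,\pi}(\mathcal{A}_{nil} \cap \mathcal{A}) = 0$.
Hence in the sequel, without loss of generality, given
any event $\mathcal{A}$ we only consider the event $\mathcal{A} \setminus \mathcal{A}_{nil}$, and
with a little abuse of notation we use $T_{s,\mathcal{A}}^{\sigma,\pi}$ to represent
the stochastic tree $T_{s,\mathcal{A} \setminus \mathcal{A}_{nil}}^{\sigma,\pi}$. Furthermore, again
without loss of generality, we assume that for any
$x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Cone}(x) \cap \mathcal{A}) > 0$. Henceforth,
for any $x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}$ we write $\Pr_x^{\sigma,\pi}(\mathcal{B} \mid \mathcal{A})$ to denote
$\Pr_s^{\sigma,\pi}(\mathcal{B} \mid \mathrm{Cone}(x) \cap \mathcal{A})$.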

Definition 2.4. (Perennial ε-optimal strategies)
Given $\varepsilon > 0$, a strategy $\sigma$ is a perennial ε-optimal
strategy for player 1, from state s, if for all strategies $\pi$
and for all nodes x in the stochastic tree $Tr_s^{\sigma,\pi}$, we have
$\Pr_x^{\sigma,\pi}(\Omega_{es}) \ge \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(\langle x\rangle) - \varepsilon$, i.e., in the stochastic
sub-tree rooted at x player 1 is ensured the value of
the game at $\langle x\rangle$ within ε. The perennial ε-optimal
strategies for player 2 are defined analogously. We
denote by $\Sigma_\varepsilon$ and $\Pi_\varepsilon$ the sets of perennial ε-optimal
strategies for player 1 and player 2, respectively. ■

The ε-optimal strategies constructed for parity objectives
in [8] are perennial ε-optimal strategies. This leads
to the following result.

Proposition 2.1. For all $\varepsilon > 0$, we have $\Sigma_\varepsilon \ne \emptyset$ and
$\Pi_\varepsilon \ne \emptyset$.

3  Results 

In this section we construct polynomial witnesses for
perennial ε-optimal strategies and describe a polynomial
procedure to verify the witnesses. As an immediate
consequence, the values of concurrent parity games
can be decided within ε-precision in NP ∩ coNP. Since
the values can be irrational, one can only hope to
ε-approximate the values. Our proof techniques reveal
several key characteristics of perennial ε-optimal strategies.
In general, perennial ε-optimal strategies require
infinite memory [7, 8]. We show that even though the
perennial ε-optimal strategies require infinite memory
in general, there exist perennial ε-optimal strategies
that in the limit for ε → 0 converge to memoryless
strategies. This result parallels the celebrated result
of Mertens-Neyman [18] for concurrent games with
limit-average objectives, which states that there exist
ε-optimal strategies that in the limit coincide with memoryless
strategies (the memoryless strategies correspond
to the memoryless optimal strategies in the discounted
game with discount factor very close to 0). However, the
memoryless strategies to which the ε-optimal strategies
converge are not necessarily ε-optimal themselves.

^ Note that the stochastic tree $T_{s,\mathcal{A}}^{\sigma,\pi}$ is not constructed by
extending every finite prefix of paths.

In concurrent games with safety objectives, optimal
memoryless strategies always exist, and the optimal
strategies in general require randomization [11]. In the case
of concurrent games with reachability objectives, optimal
strategies need not exist, but memoryless ε-optimal
strategies exist for all ε > 0 [11] and the ε-optimal
strategies require randomization. In the case of concurrent
games with Büchi objectives, ε-optimal strategies require
infinite memory in general [7]. In contrast, we
show that for all ε > 0, memoryless ε-optimal strategies
exist for all concurrent games with coBüchi objectives;
it follows from the simpler case of reachability
objectives that optimal strategies need not exist and
ε-optimal strategies require randomization. It follows
from the results on Büchi objectives that for concurrent
parity games with 3 or more priorities, ε-optimal
strategies require in general infinite memory. Our results
thus complete the characterization of the memory
requirements of ε-optimal strategies in concurrent parity
games.

Reachability properties. Several key properties of
perennial ε-optimal strategies will follow by analyzing
the behavior of the strategies with respect to some
reachability and safety objectives. In the sequel, we
consider stochastic trees $T_{s,\mathcal{A}}^{\sigma,\pi}$ such that $\Pr_s^{\sigma,\pi}(\mathcal{A}) > 0$.
Given a stochastic tree $T_{s,\mathcal{A}}^{\sigma,\pi}$, let $\kappa$ be a subset of
nodes, i.e., $\kappa \subseteq Tr_{s,\mathcal{A}}^{\sigma,\pi}$. Analogous to the definition of
reachability and safety we define the following notions
of reachability and safety in the stochastic tree:

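1. Reachability in tree. For a set $\kappa \subseteq Tr_{s,\mathcal{A}}^{\sigma,\pi}$,
let $\mathrm{ReachTree}(\kappa) = \{ \langle\tau\rangle \mid \tau \text{ is an infinite path in } Tr_{s,\mathcal{A}}^{\sigma,\pi} \text{ such that there exists } i \in \mathbb{N}.\ \tau_i \in \kappa \}$
denote the set of paths that reach the
subset $\kappa$ of nodes.

2. Safety in tree. For a set $\kappa \subseteq Tr_{s,\mathcal{A}}^{\sigma,\pi}$,
let $\mathrm{SafeTree}(\kappa) = \{ \langle\tau\rangle \mid \tau \text{ is an infinite path in } Tr_{s,\mathcal{A}}^{\sigma,\pi} \text{ such that for all } i \in \mathbb{N}.\ \tau_i \in \kappa \}$
denote the set of paths that stay safe
in the subset $\kappa$ of nodes.

Given a positive integer k and a set $\kappa \subseteq Tr_{s,\mathcal{A}}^{\sigma,\pi}$, we define
$\mathrm{ReachTree}^k(\kappa) = \{ \langle\tau\rangle \mid \tau \text{ is an infinite path in } Tr_{s,\mathcal{A}}^{\sigma,\pi},\ \exists i \le k.\ \tau_i \in \kappa \}$,
i.e., the set of paths that reach $\kappa$ within k steps.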

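Lemma 3.1. (Reachability Lemma) Let $T_{s,\mathcal{A}}^{\sigma,\pi}$ be a
stochastic tree.

1. For a set $\kappa \subseteq Tr_{s,\mathcal{A}}^{\sigma,\pi}$, if
$\inf_{x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}} \Pr_x^{\sigma,\pi}(\mathrm{ReachTree}(\kappa) \mid \mathcal{A}) > 0$, then
$\Pr_x^{\sigma,\pi}(\mathrm{ReachTree}(\kappa) \mid \mathcal{A}) = 1$ for all nodes
$x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}$.

2. For a set $U \subseteq S$, if $\inf_{x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}} \Pr_x^{\sigma,\pi}(\mathrm{Reach}(U) \mid \mathcal{A}) > 0$,
then $\Pr_x^{\sigma,\pi}(\mathrm{Reach}(U) \mid \mathcal{A}) = 1$ for all
nodes $x \in Tr_{s,\mathcal{A}}^{\sigma,\pi}$.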

Proof. We prove the first case and show that the second
case is an immediate consequence.

1. Let $0 < c < \inf_{x \in Tr^{\sigma,\pi}_s} \Pr^{\sigma,\pi}_{x}(\text{ReachTree}(\kappa) \mid \mathcal{A})$.
Choose $0 < c' < c$. For every node $x \in Tr^{\sigma,\pi}_s$, there
exists $k_x$ such that $\Pr^{\sigma,\pi}_{x}(\text{ReachTree}^{k_x}(\kappa) \mid \mathcal{A}) > c'$.
Consider $k_1 = k_{\epsilon}$ (recall that $\epsilon$ is the root of
the tree) and consider the frontier $F_1$ of $Tr^{\sigma,\pi}_s$ at
depth $k_1$. Given a frontier $F$ at depth $k$, let $\overline{F}$
be the set of nodes $x$ in $F$ such that the path
from the root to $x$ has not visited a node in $\kappa$,
i.e., none of $\epsilon, x_1, x_2, \ldots, x_{|x|}$ is in $\kappa$. For a frontier
$F_i$, define $k_{i+1} = \max\{\, k_x \mid x \in F_i \,\}$. Inductively,
define the frontier $F_{i+1}$ at depth $\sum_{j=1}^{i+1} k_j$. It
follows that for $k = \sum_{i=1}^{n} k_i$ we have
$\Pr^{\sigma,\pi}_{\epsilon}(\Omega_s \setminus \text{ReachTree}^{k}(\kappa) \mid \mathcal{A}) < (1 - c')^{n}$.
Since $\lim_{n \to \infty} (1 - c')^{n} = 0$, the desired result follows for the root
of the tree. Since $\inf_{x \in Tr^{\sigma,\pi}_s} \Pr^{\sigma,\pi}_{x}(\text{ReachTree}(\kappa) \mid \mathcal{A}) > 0$,
it follows that for all nodes $x \in Tr^{\sigma,\pi}_s$ we
have $\inf_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{ReachTree}(\kappa) \mid \mathcal{A}) > 0$.
Arguing similarly for the subtree rooted at the node
$x$, the desired result follows.

2. Observe that with $\kappa = \{\, x \in Tr^{\sigma,\pi}_s \mid \langle x \rangle \in U \,\}$,
we have $\text{Reach}(U) = \text{ReachTree}(\kappa)$. The result is
immediate from part 1. ■
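
The geometric decay used in part 1 can be checked numerically. The following is a minimal sketch that assumes only the bound $(1 - c')^{n}$ from the proof; the function names `miss_bound` and `rounds_needed` and the tolerance parameter `tol` are hypothetical and introduced for illustration only.

```python
def miss_bound(c_prime: float, n: int) -> float:
    """Upper bound (1 - c')^n on the probability of not having
    reached kappa after n successive frontiers."""
    if not 0.0 < c_prime < 1.0:
        raise ValueError("c' must lie strictly between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - c_prime) ** n


def rounds_needed(c_prime: float, tol: float) -> int:
    """Smallest n such that (1 - c')^n <= tol (hypothetical helper)."""
    if not 0.0 < tol < 1.0:
        raise ValueError("tol must lie strictly between 0 and 1")
    n = 0
    # The bound decreases strictly in n, so a linear scan terminates.
    while miss_bound(c_prime, n) > tol:
        n += 1
    return n
```

For any fixed $c' > 0$ the scan terminates, which mirrors the limit argument: the probability of avoiding $\kappa$ forever is below every positive tolerance.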

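Notation. Let $\mathcal{A} \subseteq \Omega_s$ be a measurable event such
that $\Pr^{\sigma,\pi}_{s}(\mathcal{A}) > 0$. For a set $B \subseteq S$, let
$\text{InfSet}(B) = \{\, \omega \mid \text{Inf}(\omega) \subseteq B \,\}$ and
$\text{InfSetEq}(B) = \{\, \omega \mid \text{Inf}(\omega) = B \,\}$.
Given a node $x$ in $Tr^{\sigma,\pi}_s$ and $\varepsilon > 0$, we define
$\mathcal{C}^{\sigma,\pi}_{\varepsilon}(x) = \{\, B \subseteq S \mid \Pr^{\sigma,\pi}_{x}(\text{InfSet}(B) \mid \mathcal{A}) > 1 - \varepsilon \,\}$.

Note that for $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ such that $\varepsilon_1 < \varepsilon_2$, for all
nodes $x \in Tr^{\sigma,\pi}_s$, if $B \in \mathcal{C}^{\sigma,\pi}_{\varepsilon_1}(x)$ then $B \in \mathcal{C}^{\sigma,\pi}_{\varepsilon_2}(x)$. We
define $\mathcal{C}^{\sigma,\pi}(x) = \lim_{\varepsilon \to 0} \mathcal{C}^{\sigma,\pi}_{\varepsilon}(x)$. The monotonicity
property of $\mathcal{C}^{\sigma,\pi}_{\varepsilon}$ with respect to $\varepsilon$ ensures that $\mathcal{C}^{\sigma,\pi}(x)$
exists for every $x \in Tr^{\sigma,\pi}_s$.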
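
The difference between InfSet and InfSetEq is easiest to see on ultimately periodic paths. The sketch below assumes a path is given as a finite prefix followed by a cycle repeated forever, so that the set of states visited infinitely often is exactly the set of states on the cycle; this representation and the function names are assumptions made for illustration and are not part of the original text.

```python
from typing import FrozenSet, Hashable, Iterable, Sequence


def inf_states(cycle: Sequence[Hashable]) -> FrozenSet[Hashable]:
    """Inf(w) for a path of the form prefix . cycle^omega.

    The prefix is visited only finitely often, so only the cycle matters.
    """
    if not cycle:
        raise ValueError("the repeated cycle must be non-empty")
    return frozenset(cycle)


def in_inf_set(cycle: Sequence[Hashable], b: Iterable[Hashable]) -> bool:
    """Membership in InfSet(B): Inf(w) is a subset of B."""
    return inf_states(cycle) <= frozenset(b)


def in_inf_set_eq(cycle: Sequence[Hashable], b: Iterable[Hashable]) -> bool:
    """Membership in InfSetEq(B): Inf(w) equals B."""
    return inf_states(cycle) == frozenset(b)
```

A path cycling through `s1, s2` lies in InfSet of `{s1, s2, s3}` but not in InfSetEq of that set, which is why the minimal element of $\mathcal{C}^{\sigma,\pi}(x)$ is the interesting one.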

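Lemma 3.2. For all nodes $x \in Tr^{\sigma,\pi}_s$, there is a unique
minimal element of $\mathcal{C}^{\sigma,\pi}(x)$ under the $\subseteq$ ordering.

We define the function $\mathcal{M}^{\sigma,\pi} : Tr^{\sigma,\pi}_s \to 2^{S}$ that
assigns to every node $x \in Tr^{\sigma,\pi}_s$ the minimum element of
$\mathcal{C}^{\sigma,\pi}(x)$. Formally, we have
$\mathcal{M}^{\sigma,\pi}(x) = \bigcap_{B \in \mathcal{C}^{\sigma,\pi}(x)} B = \lim_{\varepsilon \to 0} \bigcap_{B \in \mathcal{C}^{\sigma,\pi}_{\varepsilon}(x)} B$.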

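Proposition 3.1. For every $x \in Tr^{\sigma,\pi}_s$ and for every
successor $x_1$ of $x$ we have $\mathcal{M}^{\sigma,\pi}(x_1) \subseteq \mathcal{M}^{\sigma,\pi}(x)$.

Lemma 3.3. Given an $S$-labeled tree $\mathcal{T}^{\sigma,\pi}_s$, for all nodes
$x \in Tr^{\sigma,\pi}_s$, for all $\varepsilon > 0$, there is a set $B \subseteq S$ and a node
$x_1 \in Tr^{\sigma,\pi}_s(x)$ such that $\Pr^{\sigma,\pi}_{x_1}(\text{InfSetEq}(B) \mid \mathcal{A}) > 1 - \varepsilon$.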

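Proof. The proof is by induction on $|\mathcal{M}^{\sigma,\pi}(x)|$.

Base Case. If $|\mathcal{M}^{\sigma,\pi}(x)| = 1$, let $\mathcal{M}^{\sigma,\pi}(x) = \{ s \}$. Then
for all nodes $x_1 \in Tr^{\sigma,\pi}_s(x)$ we have
$\Pr^{\sigma,\pi}_{x_1}(\text{InfSet}(\{ s \}) \mid \mathcal{A}) > 1 - \varepsilon$ for all $\varepsilon > 0$.
Since the set of states visited infinitely often is non-empty, for all nodes
$x_1 \in Tr^{\sigma,\pi}_s(x)$ and all $\varepsilon > 0$ we have
$\Pr^{\sigma,\pi}_{x_1}(\text{InfSetEq}(\{ s \}) \mid \mathcal{A}) > 1 - \varepsilon$.

Inductive Case. Suppose there exists a node $x_1 \in Tr^{\sigma,\pi}_s(x)$
such that $\mathcal{M}^{\sigma,\pi}(x_1) \subsetneq \mathcal{M}^{\sigma,\pi}(x)$; then
$|\mathcal{M}^{\sigma,\pi}(x_1)| < |\mathcal{M}^{\sigma,\pi}(x)|$ and the result follows
by the inductive hypothesis at $x_1$. Otherwise, for every
node $x_1 \in Tr^{\sigma,\pi}_s(x)$ we have $\mathcal{M}^{\sigma,\pi}(x_1) = \mathcal{M}^{\sigma,\pi}(x)$.
Let the set $\mathcal{M}^{\sigma,\pi}(x)$ be $B$. We have
$\lim_{\varepsilon \to 0} \bigcap_{x_1 \in Tr^{\sigma,\pi}_s(x)} \bigl( \bigcap_{B' \in \mathcal{C}^{\sigma,\pi}_{\varepsilon}(x_1)} B' \bigr) = B$.

• Suppose we have $\inf_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{Reach}(\{ s \}) \mid \mathcal{A}) > 0$
for all states $s \in B$. Then it follows from
Lemma 3.1 that for all nodes $x_1 \in Tr^{\sigma,\pi}_s(x)$ and all $s \in B$ we have
$\Pr^{\sigma,\pi}_{x_1}(\text{Reach}(\{ s \}) \mid \mathcal{A}) = 1$. Hence, for all nodes
$x_1 \in Tr^{\sigma,\pi}_s(x)$ we have $\Pr^{\sigma,\pi}_{x_1}(\text{InfSetEq}(B) \mid \mathcal{A}) = 1$.

• Otherwise, consider a state $s \in B$ such that
$\inf_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{Reach}(\{ s \}) \mid \mathcal{A}) = 0$. For every
$\varepsilon > 0$, there must be a node $x_1 \in Tr^{\sigma,\pi}_s(x)$ such that
$\Pr^{\sigma,\pi}_{x_1}(\text{InfSet}(B \setminus \{ s \}) \mid \mathcal{A}) > 1 - \varepsilon$. Thus, we have
$\lim_{\varepsilon \to 0} \bigcap_{x_1 \in Tr^{\sigma,\pi}_s(x)} \bigl( \bigcap_{B' \in \mathcal{C}^{\sigma,\pi}_{\varepsilon}(x_1)} B' \bigr) \subseteq B \setminus \{ s \}$.
This is a contradiction to the fact that for all
nodes $x_1 \in Tr^{\sigma,\pi}_s(x)$ we have $\mathcal{M}^{\sigma,\pi}(x_1) = B$ (i.e.,
$\lim_{\varepsilon \to 0} \bigcap_{x_1 \in Tr^{\sigma,\pi}_s(x)} \bigl( \bigcap_{B' \in \mathcal{C}^{\sigma,\pi}_{\varepsilon}(x_1)} B' \bigr) = B$).
The desired result follows. ■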

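Lemma 3.4. For every stochastic tree $\mathcal{T}^{\sigma,\pi}_s$, for every
node $x \in Tr^{\sigma,\pi}_s$ one of the following conditions holds: (a)
for all $\varepsilon > 0$, there is a node $x_1 \in Tr^{\sigma,\pi}_s(x)$ such that
$\Pr^{\sigma,\pi}_{x_1}(\Omega_{e,s} \mid \mathcal{A}) > 1 - \varepsilon$; or (b) for all $\varepsilon > 0$, there is a
node $x_1 \in Tr^{\sigma,\pi}_s(x)$ such that $\Pr^{\sigma,\pi}_{x_1}(\Omega_{o,s} \mid \mathcal{A}) > 1 - \varepsilon$.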

Lemma 3.4 is an easy consequence of Lemma 3.3. In the
sequel, we denote by $W_1 = \{\, s \mid \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) = 1 \,\}$ and
$W_2 = \{\, s \mid \langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o)(s) = 1 \,\}$ the sets of states where
player 1 and player 2 can achieve value 1, respectively.
We will prove that if both players play one of their
perennial $\varepsilon$-optimal strategies, with $\varepsilon \to 0$, then the
play reaches $W_1 \cup W_2$ with probability 1. For a set
$T \subseteq S$ we denote by $\overline{T}$ the set $S \setminus T$. Given a state
$s$ and a set $T$ of vertices we write $\text{Safe}_s(T)$ to denote
$\text{Safe}(T) \cap \Omega_s$ and $\text{Reach}_s(T)$ to denote $\text{Reach}(T) \cap \Omega_s$.

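Lemma 3.5. (Reachability with $\varepsilon$-optimal
strategies) Given a game structure $G$, consider a
strategy pair $(\sigma, \pi) \in \Sigma_{\varepsilon} \times \Pi_{\varepsilon}$, for sufficiently small $\varepsilon$.
For all states $s$ and for all nodes $x \in Tr^{\sigma,\pi}_s$ we have
$\Pr^{\sigma,\pi}_{x}(\text{Safe}_s(\overline{W_1 \cup W_2})) = 0$.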

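Proof. Fix $\eta > 0$ such that $0 < 2 \cdot \eta < \alpha =
\min\{\, \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s),\ \langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o)(s) \mid s \in \overline{W_1 \cup W_2} \,\}$, i.e.,
$\alpha$ is the least positive value for player 1 or player 2. Consider
a strategy pair $(\sigma, \pi) \in \Sigma_{\eta} \times \Pi_{\eta}$, i.e., the strategies
are perennial $\eta$-optimal strategies. Let
$U^{\sigma,\pi} = \{\, x \in Tr^{\sigma,\pi}_s \mid s \in \overline{W_1 \cup W_2}$ and
$\Pr^{\sigma,\pi}_{x}(\text{Safe}_s(\overline{W_1 \cup W_2})) > 0 \,\}$.
If $U^{\sigma,\pi}$ is empty the desired result follows.

Assume for the sake of contradiction that $U^{\sigma,\pi}$
is non-empty. Let $x$ be a node in $U^{\sigma,\pi}$ and consider
the $S$-labeled subtree $\mathcal{T}^{\sigma,\pi}_s(x)$ rooted at
$x$. Since $\Pr^{\sigma,\pi}_{x}(\text{Safe}_s(\overline{W_1 \cup W_2})) > 0$, we must
have $\inf_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{Reach}_s(W_1 \cup W_2)) = 0$,
or equivalently $\sup_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{Safe}_s(\overline{W_1 \cup W_2})) = 1$.
In fact, from Lemma 3.1 we have that
$\inf_{x_1 \in Tr^{\sigma,\pi}_s(x)} \Pr^{\sigma,\pi}_{x_1}(\text{Reach}_s(W_1 \cup W_2)) > 0$ implies
$\Pr^{\sigma,\pi}_{x}(\text{Reach}_s(W_1 \cup W_2)) = 1$.

Consider a node $x_1 \in Tr^{\sigma,\pi}_s(x)$ such that
$\Pr^{\sigma,\pi}_{x_1}(\text{Safe}_s(\overline{W_1 \cup W_2})) > 1 - \eta$. Let $\mathcal{A}$ be the event
$\text{Safe}_s(\overline{W_1 \cup W_2})$. Since $\sigma$ and $\pi$ are perennial $\eta$-optimal
strategies, and $\Pr^{\sigma,\pi}_{x_1}(\mathcal{A}) > 1 - \eta$, it follows that for every
node $x_2 \in Tr^{\sigma,\pi}_s(x_1)$ we have
$\Pr^{\sigma,\pi}_{x_2}(\Omega_{e,s} \mid \mathcal{A}) > c_1 > (\alpha - 2\eta) > 0$ and
$\Pr^{\sigma,\pi}_{x_2}(\Omega_{o,s} \mid \mathcal{A}) > c_2 > (\alpha - 2\eta) > 0$.
This implies that for all nodes $x_2 \in Tr^{\sigma,\pi}_s(x_1)$ we have
$\Pr^{\sigma,\pi}_{x_2}(\Omega_{e,s} \mid \mathcal{A}) < 1 - c_2$ and $\Pr^{\sigma,\pi}_{x_2}(\Omega_{o,s} \mid \mathcal{A}) < 1 - c_1$.
It follows from Lemma 3.4 that for every $\varepsilon > 0$, there
is a node $x_2 \in Tr^{\sigma,\pi}_s(x_1)$ such that either
$\Pr^{\sigma,\pi}_{x_2}(\Omega_{e,s} \mid \mathcal{A}) > 1 - \varepsilon$ or $\Pr^{\sigma,\pi}_{x_2}(\Omega_{o,s} \mid \mathcal{A}) > 1 - \varepsilon$.
Since $c_1$ and $c_2$ are constants greater than 0, we have a contradiction.
Hence $U^{\sigma,\pi} = \emptyset$ and the result follows. ■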

Reduction to qualitative witness. The notion of
local optimality plays an important role in our construction
of polynomial witnesses. Informally, a selector
function $\xi$ is locally optimal if it is optimal in the
one-step matrix game where each state $s$ is assigned a reward
value $\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s)$. A locally optimal strategy is
a strategy that consists of locally optimal selectors. A
locally $\varepsilon$-optimal strategy is a strategy that has a total
deviation from locally optimal selectors of at most
$\varepsilon$. Locally optimal selectors and strategies play a role
in the construction of polynomial witnesses, since local
optimality is a notion that can be checked in polynomial
time.

We note that local $\varepsilon$-optimality and $\varepsilon$-optimality are
very different notions. Local $\varepsilon$-optimality consists of the
approximation of a local selector; a locally $\varepsilon$-optimal
strategy provides no guarantee of yielding a probability
of winning the game close to the optimal one. On the
other hand, an $\varepsilon$-optimal strategy is a strategy that
guarantees a probability of winning close to the optimal
one; there are no constraints on its local structure. Our
polynomial witnesses will consist of strategies that are
locally $\varepsilon$-optimal (which can be checked in polynomial
time), and that have a particular structure that ensures
their global $\varepsilon$-optimality.

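Definition 3.1. (Locally $\varepsilon$-optimal selectors
and strategies) A selector $\xi$ is locally optimal if for
all $s \in S$ and $a_2 \in \Gamma_2(s)$ we have
$E[\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(\Theta_1) \mid s, \xi(s), a_2] \ge \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s)$.
We denote by $\Lambda^{\ell}$ the set of locally optimal selectors. A strategy $\sigma$ is
locally optimal if for every history $\langle s_0, s_1, \ldots, s_k \rangle$ we have
$\sigma(s_0, s_1, \ldots, s_k) \in \Lambda^{\ell}$, i.e., player 1 plays a locally
optimal selector at every stage of the play. We denote by
$\Sigma^{\ell}$ the set of locally optimal strategies. A strategy $\sigma_{\varepsilon}$ is
locally $\varepsilon$-optimal if for every strategy $\pi \in \Pi$ and for every
$\omega = \langle s_0, s_1, s_2, \ldots \rangle \in \text{Outcome}(s, \sigma_{\varepsilon}, \pi)$ we have

$\sum_{k=0}^{\infty} \max\{\, \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s_k) - E[\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(\Theta_{k+1}) \mid s_k, \sigma_{\varepsilon}(\omega_k), \pi(\omega_k)],\ 0 \,\} \le \varepsilon$,

where $\omega_k = \langle s_0, s_1, \ldots, s_k \rangle$.
We denote by $\Sigma^{\ell}_{\varepsilon}$ the set of locally $\varepsilon$-optimal strategies.

■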
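
The one-step check behind local optimality can be sketched in a few lines. This is a minimal illustration, assuming the value of every state is already known and stored as exact fractions; the dictionary layout (`value`, `delta`, `moves2`) and all function names are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def expected_value(value, delta, s, dist1, a2):
    """E[value(next state) | s, dist1, a2]: player 1 randomizes by dist1,
    player 2 plays the pure move a2."""
    total = Fraction(0)
    for a1, p1 in dist1.items():
        for t, pt in delta[(s, a1, a2)].items():
            total += p1 * pt * value[t]
    return total


def is_locally_optimal_at(value, delta, moves2, s, dist1):
    """True if dist1 keeps the expected value at least value[s]
    against every move of player 2 at s."""
    return all(expected_value(value, delta, s, dist1, a2) >= value[s]
               for a2 in moves2[s])


def deviation(value, delta, s, dist1, a2):
    """One-round term max{value[s] - E[...], 0} of the sum that a
    locally eps-optimal strategy must keep below eps in total."""
    return max(value[s] - expected_value(value, delta, s, dist1, a2),
               Fraction(0))
```

In a matching-pennies style state with value 1/2, the uniform distribution passes the check, while a pure move fails it with a one-round deviation of 1/2.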

Observe that a strategy that at each round $i$ chooses a
locally optimal selector with probability at least $(1 - \varepsilon_i)$,
with $\sum_{i} \varepsilon_i \le \varepsilon$, is a locally $\varepsilon$-optimal strategy. A
value class of the game is the set of all states where the
game has a given value: a value class $VC(r)$ is the set of
states $s$ such that the value for player 1 is $r$. Formally,
$VC(r) = \{\, s \mid \langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) = r \,\}$. Intuitively, we can
picture the game as a “quilt” of value classes. Two of the
value classes correspond to values 1 (player 1 wins with
probability arbitrarily close to 1) and 0 (player 2 wins
with probability arbitrarily close to 1); the other value
classes correspond to intermediate values. We construct
a polynomial witness in a piecemeal fashion. We first
show that we can construct, for each intermediate value
class, a strategy that with probability arbitrarily close
to 1 guarantees either leaving the class, or winning
without leaving the class. Such a strategy can be
constructed using results from [7], and has a polynomial
witness. Second, we show that the above strategy can
be constructed so that when the class is left, it is left
via a locally $\varepsilon$-optimal selector. By stitching together
the strategies constructed in this fashion for the various
value classes, we will obtain a single polynomial witness
for the complete game. The construction of a strategy
in a value class relies on the following reduction.
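
The “quilt” of value classes is simply a partition of the states by value. A minimal sketch, assuming the values are available as exact fractions in a dictionary (the names `value_classes` and `intermediate_classes` are hypothetical):

```python
from collections import defaultdict
from fractions import Fraction


def value_classes(value):
    """Partition the states by their value: maps r to VC(r)."""
    classes = defaultdict(set)
    for state, r in value.items():
        classes[r].add(state)
    return dict(classes)


def intermediate_classes(value):
    """Only the classes VC(r) with 0 < r < 1, i.e. the classes for
    which a reduced game is built."""
    return {r: states for r, states in value_classes(value).items()
            if 0 < r < 1}
```

Exact arithmetic matters here: with floating-point values, two states of the same class could be split by rounding error.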

Reduction. Let $G = (S, M, \Gamma_1, \Gamma_2, \delta)$ be a concurrent
game with parity objectives $\text{Parity}(p)$ and $\text{coParity}(p)$
for player 1 and player 2 respectively, and let the
priority function be $p$. For a state $s \in S$, we define
the set of allowable supports
$\text{OptSupps}(s) = \{\, \gamma \subseteq \Gamma_1(s) \mid \exists \xi^{\ell} \in \Lambda^{\ell}.\ \text{Supp}(\xi^{\ell}(s)) = \gamma \,\}$
to be the set of supports of locally optimal selectors. For every $s \in S$,
we assume that we have a fixed way to enumerate
$\text{OptSupps}(s) = \{ \gamma_1, \gamma_2, \ldots, \gamma_n \}$. Consider a value class
$VC(r)$ with $0 < r < 1$. We construct a concurrent
game $\widetilde{G}_r = (\widetilde{S}_r, \widetilde{M}, \widetilde{\Gamma}_1, \widetilde{\Gamma}_2, \widetilde{\delta})$ with a priority function $\widetilde{p}$
as follows:

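1. State space. $\widetilde{S}_r = \{\, \widetilde{s} \mid s \in VC(r) \,\} \cup \{ w_1, w_2 \} \cup
\{\, (\widetilde{s}, i) \mid s \in VC(r),\ i \in \{ 1, 2, \ldots, |\text{OptSupps}(s)| \} \,\}$.

2. Priority function. (a) $\widetilde{p}(\widetilde{s}) = p(s)$ for all $s \in VC(r)$;
(b) $\widetilde{p}((\widetilde{s}, i)) = p(s)$ for all $(\widetilde{s}, i) \in \widetilde{S}_r$; and (c)
$\widetilde{p}(w_1) = 0$ and $\widetilde{p}(w_2) = 1$.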

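3. Moves assignment.

(a) $\widetilde{\Gamma}_1(\widetilde{s}) = \{ 1, 2, \ldots, |\text{OptSupps}(s)| \}$ and $\widetilde{\Gamma}_2(\widetilde{s}) = \{ a_2 \}$.
Note that every state $\widetilde{s}$ is a player-1 turn-based state.

(b) $\widetilde{\Gamma}_1((\widetilde{s}, i)) = \{ i \} \cup (\Gamma_1(s) \setminus \gamma_i)$, where
$\text{OptSupps}(s) = \{ \gamma_1, \gamma_2, \ldots, \gamma_n \}$, and
$\widetilde{\Gamma}_2((\widetilde{s}, i)) = \Gamma_2(s)$. At state $(\widetilde{s}, i)$ all the moves
in $\gamma_i$ are collapsed to one move $i$, and all the
moves not in $\gamma_i$ are still available.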

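4. Transition function.

(a) The states $w_1$ and $w_2$ are absorbing states.
Observe that player 1 has value 1 at state $w_1$,
and value 0 at state $w_2$.

(b) For any state $\widetilde{s}$ we have $\widetilde{\delta}(\widetilde{s}, i, a_2)((\widetilde{s}, i)) = 1$:
at state $\widetilde{s}$, player 1 can decide which element
of $\text{OptSupps}(s)$ to play, and if player 1 chooses
move $i$ the game proceeds to state $(\widetilde{s}, i)$.

(c) Transition function at state $(\widetilde{s}, i)$. Let
$\text{OptSupps}(s) = \{ \gamma_1, \gamma_2, \ldots, \gamma_n \}$.

i. For all moves $a_2 \in \Gamma_2(s)$, if there is $a_1 \in \gamma_i$
such that $\sum_{s' \notin VC(r)} \delta(s, a_1, a_2)(s') > 0$,
then $\widetilde{\delta}((\widetilde{s}, i), i, a_2)(w_1) = 1$.
The above transition specifies that, when
a pair of moves $a_1, a_2$ with $a_1 \in \gamma_i$ is
played, if the game $G$ proceeds with positive
probability to a different value class,
then the game $\widetilde{G}_r$ proceeds to the state
$w_1$, which has value 1 for player 1. Note
that since $a_1 \in \gamma_i$ and $\gamma_i \in \text{OptSupps}(s)$,
if the game $G$ proceeds to a different value
class with positive probability, it proceeds
to $\bigcup_{k > r} VC(k)$ with positive probability.

ii. For all moves $a_2 \in \Gamma_2(s)$, if for all $a_1 \in \gamma_i$
we have $\sum_{s' \in VC(r)} \delta(s, a_1, a_2)(s') = 1$,
then $\widetilde{\delta}((\widetilde{s}, i), i, a_2)(\widetilde{s}') = \sum_{a_1 \in \gamma_i} \xi^{\ell}(s)(a_1) \cdot \delta(s, a_1, a_2)(s')$,
where $\xi^{\ell}$ is a locally optimal selector with $\text{Supp}(\xi^{\ell}(s)) = \gamma_i$.

iii. For all moves $a_1 \in (\Gamma_1(s) \setminus \gamma_i)$ and
$a_2 \in \Gamma_2(s)$ we let $\widetilde{\delta}((\widetilde{s}, i), a_1, a_2)(\widetilde{s}') = \delta(s, a_1, a_2)(s')$
for $s' \in VC(r)$; furthermore, we let
$\widetilde{\delta}((\widetilde{s}, i), a_1, a_2)(w_2) = \sum_{s' \notin VC(r)} \delta(s, a_1, a_2)(s')$.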
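
The transition rules 4(a) to 4(c) can be sketched as a small construction routine. This is an illustration under stated assumptions, not the authors' implementation: it assumes that the value class, the move sets, the transition function, and for each state an enumerated list of pairs (support, locally optimal selector restricted to that support) are supplied by the caller; it assumes that moves of the original game are not integers, so that the collapsed move `i` cannot clash with them; the tags `"base"` and `"choice"` and the names `W1`, `W2`, `DUMMY` are hypothetical.

```python
from fractions import Fraction

# Hypothetical names for the two sinks and player 2's single dummy move.
W1, W2, DUMMY = "w1", "w2", "dummy"


def build_reduced_game(vc, moves1, moves2, delta, opt):
    """Transition table of the reduced game for one value class.

    vc      -- set of states of the value class
    moves1  -- state -> set of player-1 moves (assumed non-integer)
    moves2  -- state -> set of player-2 moves
    delta   -- (state, a1, a2) -> {successor: probability}
    opt     -- state -> list of (support, selector) pairs, enumerated from 1
    """
    red = {(W1, DUMMY, DUMMY): {W1: Fraction(1)},      # rule 4(a)
           (W2, DUMMY, DUMMY): {W2: Fraction(1)}}
    for s in vc:
        for i, (gamma, xi) in enumerate(opt[s], start=1):
            node = ("choice", s, i)
            red[(("base", s), i, DUMMY)] = {node: Fraction(1)}   # rule 4(b)
            for a2 in moves2[s]:
                leaves = any(p > 0 and t not in vc
                             for a1 in gamma
                             for t, p in delta[(s, a1, a2)].items())
                if leaves:                                       # rule 4(c)i
                    red[(node, i, a2)] = {W1: Fraction(1)}
                else:                                            # rule 4(c)ii
                    dist = {}
                    for a1 in gamma:
                        for t, p in delta[(s, a1, a2)].items():
                            if p > 0:
                                key = ("base", t)
                                dist[key] = dist.get(key, Fraction(0)) + xi[a1] * p
                    red[(node, i, a2)] = dist
                for a1 in set(moves1[s]) - set(gamma):           # rule 4(c)iii
                    dist, out = {}, Fraction(0)
                    for t, p in delta[(s, a1, a2)].items():
                        if p == 0:
                            continue
                        if t in vc:
                            dist[("base", t)] = p
                        else:
                            out += p           # mass leaving the class
                    if out > 0:
                        dist[W2] = out
                    red[(node, a1, a2)] = dist
    return red
```

The routine does not verify that the supplied selectors are locally optimal; that check is separate. It also builds the full game, which can be exponential in the size of $G$ because the list of supports can be.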

Lemma 3.6. For all $0 < r < 1$ and all $s \in VC(r)$, the
state $\widetilde{s}$ is limit-sure winning for player 1 in the game
$\widetilde{G}_r$, i.e., from state $\widetilde{s}$ player 1 can win with probability
arbitrarily close to 1.

Limit-sure witness. The witness strategy for a limit-sure
game constructed in [7] consists of two parts: a
ranking function of the states, and a ranking function
of the actions at a state. These ranking functions were
described by a $\mu$-calculus formula. At round $k$ of a
play, the witness strategy $\sigma$ plays at a state $s$ the actions
with least rank with positive-bounded probabilities, and
the other actions with vanishingly small probabilities
as $\varepsilon \to 0$. Hence, the strategy $\sigma$ can be described as
$\sigma = (1 - \varepsilon_k) \cdot \sigma_{\ell} + \varepsilon_k \cdot \sigma_{d}(\varepsilon_k)$, where $\sigma_{\ell}$ is a memoryless
strategy such that, at each state $s$, $\text{Supp}(\sigma_{\ell}(s))$ is the set
of actions with least rank at $s$. We denote by limit-sure
witness move set the set of actions with the least rank,
i.e., at each $s$ the set of moves $\text{Supp}(\sigma_{\ell}(s))$. It follows
from the above construction that as $\varepsilon \to 0$, the limit-sure
winning strategy $\sigma$ converges to the memoryless
selector $\sigma_{\ell}$.
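
The convergence claim can be illustrated on a single state. The sketch below mixes a least-rank distribution with a perturbation and measures the distance to the limit; it assumes, for simplicity, that the perturbation is a fixed distribution rather than one that depends on the mixing weight, and the function names are hypothetical.

```python
from fractions import Fraction


def mix(sigma_l, sigma_d, eps):
    """The distribution (1 - eps) * sigma_l + eps * sigma_d at one state."""
    if not 0 <= eps <= 1:
        raise ValueError("eps must lie in [0, 1]")
    moves = set(sigma_l) | set(sigma_d)
    zero = Fraction(0)
    return {a: (1 - eps) * sigma_l.get(a, zero) + eps * sigma_d.get(a, zero)
            for a in moves}


def sup_distance(d1, d2):
    """Largest pointwise difference between two move distributions."""
    moves = set(d1) | set(d2)
    zero = Fraction(0)
    return max(abs(d1.get(a, zero) - d2.get(a, zero)) for a in moves)
```

The distance to the least-rank distribution is at most the mixing weight, so it vanishes as the weight tends to 0, while every move in the perturbation keeps positive probability for each positive weight.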

Lemma 3.7. In the game $\widetilde{G}_r$, there is a limit-sure winning
strategy with support $i \in \{ 1, 2, \ldots, |\text{OptSupps}(s)| \}$
at $\widetilde{s}$, and with limit-sure witness move set $\gamma_i$ at $(\widetilde{s}, i)$.

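Definition 3.2. (Value-class qualitative $\varepsilon$-optimal
strategies) For $\varepsilon > 0$, a strategy $\sigma_{\varepsilon}$
is a value-class qualitative $\varepsilon$-optimal strategy for
a value class $VC(r)$, with $0 < r < 1$, if (a) $\sigma_{\varepsilon}$
is locally $\varepsilon$-optimal, and (b) for all nodes $x$ in
$Tr^{\sigma_{\varepsilon},\pi}_s$ with $\langle x \rangle \in VC(r)$ and all $\pi \in \Pi$ we have
$\Pr^{\sigma_{\varepsilon},\pi}_{x}(\Omega_{e,s} \mid \text{Safe}(VC(r))) \ge 1 - \varepsilon$. A strategy is
value-class qualitative $\varepsilon$-optimal if it is value-class
qualitative $\varepsilon$-optimal for all value classes $VC(r)$, for all
$0 < r < 1$. ■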

Lemma 3.8 states that the value-class qualitative $\varepsilon$-optimal
strategies for different value classes can be
“stitched” or composed together to produce a perennial
$\varepsilon$-optimal strategy. This allows us to produce witnesses
for individual value classes and compose them to obtain
a witness for a perennial $\varepsilon$-optimal strategy. The key
argument is as follows: given a value-class qualitative
$\varepsilon$-optimal strategy $\sigma_{\varepsilon}$, for any strategy $\pi$ for player 2, if
the game stays in a value class then player 1 wins with
probability at least $1 - \varepsilon$; otherwise, the game leaves the
value class according to the locally $\varepsilon$-optimal strategy,
and reaches $W_1$ with probability at least the value of
the game, within $\varepsilon$-precision.

Lemma 3.8. (Stitching Lemma) Let $\sigma_{\varepsilon}$ be a value-class
qualitative $\varepsilon$-optimal strategy that is also perennial
$\varepsilon$-optimal for all states in $W_1$. Then $\sigma_{\varepsilon}$ is a perennial
$\varepsilon$-optimal strategy.

Theorem 3.1 follows from the existence of memoryless
limit-sure winning strategies for concurrent games
with coBüchi objectives [7] and the existence of perennial
$\varepsilon$-optimal strategies obtained by composing value-class
qualitative $\varepsilon$-optimal strategies across value classes
(Lemma 3.8).

Theorem 3.1. (Memoryless $\varepsilon$-optimal strategies
for coBüchi objectives) For every real $\varepsilon > 0$,
memoryless $\varepsilon$-optimal strategies exist for all coBüchi objectives
on all concurrent game structures.

Theorem 3.2 states that there exist perennial $\varepsilon$-optimal
strategies that in the limit coincide with a
locally optimal selector, i.e., a memoryless strategy
with locally optimal selectors. The result follows from
Lemma 3.8, which proves the existence of perennial $\varepsilon$-optimal
strategies as value-class qualitative $\varepsilon$-optimal
strategies, and from the properties of limit-sure winning
strategies.

Theorem 3.2. (Limit of $\varepsilon$-optimal strategies)
For all concurrent game structures with parity objectives,
for every real $\varepsilon > 0$, there exists a perennial
$\varepsilon$-optimal strategy $\sigma_{\varepsilon} \in \Sigma_{\varepsilon}$ such that the sequence of
the strategies $\sigma_{\varepsilon}$ converges to a locally optimal selector
$\sigma$ as $\varepsilon \to 0$, i.e., $\lim_{\varepsilon \to 0} \sigma_{\varepsilon} = \sigma$, for $\sigma \in \Sigma^{\ell}$.

Witness for perennial $\varepsilon$-optimal strategies. The
witness for a perennial $\varepsilon$-optimal strategy is presented
as a value-class qualitative $\varepsilon$-optimal strategy
(from Lemma 3.8). The existence of a value-class qualitative
$\varepsilon$-optimal strategy follows from Lemma 3.6 and
Lemma 3.7. The witness consists of the limit-sure winning
strategy witness in the game $\widetilde{G}_r$, for all $0 < r < 1$,
and of a locally $\varepsilon$-optimal strategy. The witness can be
described as follows:

• Limit-sure witness. The limit-sure witness in the
game $\widetilde{G}_r$, for $r > 0$, is constructed as the witness
described in [7]. Observe that the game $\widetilde{G}_r$ can
be exponential in the size of the game $G$, since
the set $\text{OptSupps}(s)$ can be exponential. To obtain an
efficient polynomial witness we make the following
key observation: at every state $\widetilde{s}$ there is a pure
memoryless move $i$ for player 1 (Lemma 3.7) in
the limit-sure witness strategy. Hence player 1
constructs a game $G'_r$ such that at every state $\widetilde{s}$ there
is only a single successor $(\widetilde{s}, i)$, where $i$ is a pure
memoryless move in the limit-sure witness in $\widetilde{G}_r$.
The graph of $G'_r$ is linear in the size of the game
$G$. The witness in state $(\widetilde{s}, i)$ is the witness as
described in [7]: the witness consists of a ranking
function of the actions and a ranking function of
the state space. The witness is polynomial and can
be verified in polynomial time in the size of the game
graph.

• Locally $\varepsilon$-optimal witness. The locally $\varepsilon$-optimal
witness consists of the values of the game at all
states $s$, within $\varepsilon$-precision, and the locally optimal
selector $\sigma \in \Sigma^{\ell}$. The selector $\sigma$ may specify probabilities
that are irrational. The locally optimal
selector $\sigma$ is $\varepsilon$-approximated by a $k$-uniform selector
$\sigma_k$, where a $k$-uniform selector is a selector such
that the associated probabilities of the distribution
are multiples of $\frac{1}{j}$, for $j \le k$. It follows from [5, 14]
that $k$ is polynomial in the size of the game graph
and $\frac{1}{\varepsilon}$. The strategy $\sigma_k$ must satisfy the constraint
that $\text{Supp}(\sigma_k)$ is exactly the set of actions with the
least rank as described by the limit-sure witness.
The verification of the witness can be achieved in
polynomial time, since checking local optimality involves
verifying that $\sigma_k$ is optimal for the “one-step”
game with respect to the values at every
state.
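
To make the support constraint on the uniform approximation concrete, the following sketch rounds a distribution to multiples of $1/k$ while keeping its support unchanged. It is a simple largest-remainder rounding chosen for illustration, not the approximation argument of [5, 14]; it assumes exact rational input that sums to 1 and a single denominator `k` at least as large as the support, and the function name `k_uniform` is hypothetical.

```python
import math
from fractions import Fraction


def k_uniform(dist, k):
    """Round dist to a distribution whose probabilities are multiples
    of 1/k and whose support equals the support of dist."""
    support = [a for a, p in dist.items() if p > 0]
    m = len(support)
    if k < m:
        raise ValueError("k must be at least the size of the support")
    spare = k - m  # units left after giving one unit to every support move
    counts = {a: 1 + math.floor(dist[a] * spare) for a in support}
    leftover = k - sum(counts.values())
    # Hand the remaining units to the moves with the largest remainders.
    by_remainder = sorted(
        support,
        key=lambda a: dist[a] * spare - math.floor(dist[a] * spare),
        reverse=True,
    )
    for a in by_remainder[:leftover]:
        counts[a] += 1
    return {a: Fraction(c, k) for a, c in counts.items()}
```

Reserving one unit per support move is what guarantees that no move of the original support is rounded to probability 0; the price is an error of order $m/k$, where $m$ is the size of the support.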

It follows from the above that there are polynomial witnesses
for perennial $\varepsilon$-optimal strategies and that the witnesses can
be verified in polynomial time. This shows that the
values of concurrent parity games can be decided within
$\varepsilon$-precision in NP. Since concurrent parity games are
closed under complementation, the decision procedure
is also in coNP. The previous best known algorithm to
approximate values is triple exponential in the size of
the game graph and logarithmic in $\frac{1}{\varepsilon}$ [8].
Theorem 3.3. (Complexity of concurrent parity
games) For all concurrent game structures $G$, for
all parity objectives $\Omega_e$ and $\Omega_o$, and for all rationals
$\varepsilon > 0$,

1. for all rationals $r$, whether $\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)(s) \in [r - \varepsilon, r + \varepsilon]$
can be decided in NP $\cap$ coNP;

2. the value functions $\langle\!\langle 1 \rangle\!\rangle_{val}(\Omega_e)$ and $\langle\!\langle 2 \rangle\!\rangle_{val}(\Omega_o)$ can
be approximated within $\varepsilon$-precision in time
exponential  in  |G|  and  polynomial  in  p 